A Clustering Algorithm for Discovering Varied Density Clusters

Author

  • Ahmed M. Fahim
Abstract

Spatial data clustering is one of the important data mining techniques for extracting knowledge from the large amounts of spatial data collected in various applications, such as remote sensing, GIS, computer cartography, and environmental assessment and planning. Many useful spatial data clustering algorithms have been proposed. DBSCAN is the most popular density-based clustering algorithm; it does not restrict the shape of clusters and handles noise effectively. However, DBSCAN has trouble finding all clusters in datasets with varied densities, because it depends on a single global value for its parameter Eps. This paper presents an enhanced DBSCAN that effectively clusters spatial databases containing clusters of varying densities. The idea is to allow the value of Eps to vary according to the local density of the starting point of each cluster. The clustering process proceeds from the point with the highest local density towards the point with the lowest, and Eps is set according to the local density of the initial point of the current cluster. For each value of Eps, DBSCAN is applied to make sure that all points that are density-reachable with respect to the current Eps are clustered. In later passes, points that are already clustered are ignored, so that denser clusters are not merged with sparser ones. Varied two-dimensional synthetic datasets are used to evaluate the efficiency of the proposed enhanced DBSCAN algorithm.
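The procedure described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the author's implementation: the abstract does not say how local density is measured or how Eps is derived from it, so the sketch assumes that density is ranked by the sum of distances to the k nearest neighbours (smaller means denser) and that Eps is the k-th nearest-neighbour distance of the starting point times a hypothetical `eps_scale` factor. The names `varied_density_dbscan`, `knn_distances`, `region_query`, `k`, `min_pts` and `eps_scale` are all invented for illustration.

```python
import math

UNASSIGNED, NOISE = 0, -1


def knn_distances(points, i, k):
    """Sorted distances from points[i] to its k nearest neighbours (assumes len(points) > k)."""
    return sorted(math.dist(points[i], q) for j, q in enumerate(points) if j != i)[:k]


def region_query(points, i, eps, labels, current):
    """Indices within eps of points[i], ignoring points that belong to earlier clusters."""
    return [j for j, q in enumerate(points)
            if labels[j] in (UNASSIGNED, current) and math.dist(points[i], q) <= eps]


def varied_density_dbscan(points, k=4, min_pts=4, eps_scale=1.0):
    n = len(points)
    knn = [knn_distances(points, i, k) for i in range(n)]
    # Assumption: a smaller sum of k-NN distances means a higher local density.
    order = sorted(range(n), key=lambda i: sum(knn[i]))
    labels = [UNASSIGNED] * n
    cluster_id = 0
    for start in order:                       # densest starting points first
        if labels[start] != UNASSIGNED:
            continue
        # Assumption: Eps is taken from the starting point's k-th neighbour distance.
        eps = eps_scale * knn[start][-1]
        seeds = region_query(points, start, eps, labels, cluster_id + 1)
        if len(seeds) < min_pts:
            continue                          # not a core point at its own Eps
        cluster_id += 1
        for j in seeds:
            labels[j] = cluster_id
        queue = [j for j in seeds if j != start]
        while queue:                          # DBSCAN-style expansion with a fixed Eps
            p = queue.pop()
            neighbours = region_query(points, p, eps, labels, cluster_id)
            if len(neighbours) < min_pts:
                continue                      # border point: kept, but not expanded
            for j in neighbours:
                if labels[j] == UNASSIGNED:
                    labels[j] = cluster_id
                    queue.append(j)
    # Anything never absorbed by a cluster is reported as noise.
    return [NOISE if label == UNASSIGNED else label for label in labels]


if __name__ == "__main__":
    dense = [(float(x), float(y)) for x in range(5) for y in range(5)]
    sparse = [(100.0 + 5 * x, 100.0 + 5 * y) for x in range(5) for y in range(5)]
    print(varied_density_dbscan(dense + sparse + [(500.0, -500.0)]))
```

The sketch uses brute-force neighbour searches, so it runs in quadratic time and is meant only for small two-dimensional examples. Because points from earlier (denser) clusters are excluded from every later region query, a sparse cluster grown with a large Eps cannot absorb them, which mirrors the "clustered points are ignored" step in the abstract.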

Similar resources

Assessment of the Performance of Clustering Algorithms in the Extraction of Similar Trajectories

In recent years, the rapid growth of spatial trajectory data, together with the need to process it and extract useful information and meaningful patterns, has attracted many researchers to the field of spatio-temporal trajectory clustering. Processing and analyzing these trajectories has resulted in the extraction of useful information whic...

Improvement of density-based clustering algorithm using modifying the density definitions and input parameter

Clustering, which means grouping similar samples, is one of the main tasks in data mining. There is a wide variety of clustering algorithms, and density-based clustering is one category among them. Various density-based algorithms have been proposed; one of the most widely used is DBSCAN. DBSCAN can identify clusters of different shapes in the dataset and automatically i...

A Study of the Problems of the DBSCAN Clustering Algorithm and a Review of the Improvements Proposed for It

Clustering is an important technique for knowledge discovery in databases. Density-based clustering algorithms are one of the main families of clustering methods in data mining. These algorithms have some special features, including independence from the shape of the clusters, ease of understanding, and ease of use. DBSCAN is the base algorithm for density-based clustering. DBSCAN is able to...

A Density Based Algorithm for Discovering Density Varied Clusters in Large Spatial Databases

DBSCAN is a base algorithm for density-based clustering. It can detect clusters of different shapes and sizes in large amounts of data that contain noise and outliers. However, it fails to handle the local density variation that exists within a cluster. In this paper, we propose a density varied DBSCAN algorithm which is capable of handling local density variation within the cluste...

Refining membership degrees obtained from fuzzy C-means by re-fuzzification

Fuzzy C-means (FCM) is the most well-known and widely used fuzzy clustering algorithm. However, one of the weaknesses of FCM is that it assigns membership degrees to data based only on the distance to the cluster centers. As a result, the membership degrees are determined without considering the shape and density of the clusters. In this paper, we propose an algorithm which takes th...

Publication date: 2015